Stereotyping and Bias in the Flickr30K Dataset
Abstract
In: Proceedings of the Workshop on Multimodal Corpora: Computer vision and language processing (MMC-2016), pages 1–4. Workshop held: 24 May 2016, collocated with LREC 2016, Portorož, Slovenia. Proceedings available at: http://www.lrec-conf.org/proceedings/lrec2016/workshops/LREC2016Workshop-MCC-2016-proceedings.pdf
An untested assumption behind the crowdsourced descriptions of the images in the Flickr30K dataset (Young et al., 2014) is that they “focus only on the information that can be obtained from the image alone” (Hodosh et al., 2013, p. 859). This paper presents some evidence against this assumption, and provides a list of biases and unwarranted inferences that can be found in the Flickr30K dataset. Finally, it considers methods to find examples of these, and discusses how we should deal with stereotype-driven descriptions in future applications.
Similar resources
A Social Semiotic Analysis of Social Actors in English-Learning Software Applications
This study drew upon Kress and Van Leeuwen’s (2006, [1996]) visual grammar and Van Leeuwen’s (2008) social semiotic model to interrogate ways through which social actors of different races are visually and textually represented in four award-winning English-learning software packages. The analysis was based on narrative actional/reactional processes at the ideational level; mood, perspective, ...
The Social Effective Factors Involved in Gender Stereotyping Believe between Private and Public Sphere
An important part of understanding women's and men's attitudes and behaviors is considering their ideas and opinions. Based on gender stereotyping of view women and men have different types of behavior, manner and characteristics. Both genders do their jobs differently. Stereotyping process focuses on ability and characteristics of women, and men are lack of them. On the other hand, it also f...
Supplementary Material: Natural Language Object Retrieval
In this document, we visualize some results on the ReferIt dataset [1] using our SCRC model, showing that it can correctly retrieve an object by exploiting its description in context. We also evaluate our model on the Flickr30K Entities dataset [2], and show that our model can be applied to both “object” and “stuff”, and can generate descriptions over given image regions. 1. Retrieval on object...
Attention Correctness in Neural Image Captioning
Attention Map Visualization We visualize the attention maps of both the implicit attention model and our supervised attention model on the Flickr30k test set. As mentioned in the paper, 909 noun phrases are aligned for the implicit model and 901 for the supervised model. 635 of these alignments are common for both, and 595 of them have corresponding bounding boxes. Here we present a subset due ...
Generating Chinese Captions for Flickr30K Images
We trained a Multimodal Recurrent Neural Network on Flickr30K dataset with Chinese sentences. The RNN model is from Karpathy and Fei-Fei, 2015 [6]. As Chinese sentence has no space between words, we implemented the model on Flickr30 dataset in two methods. In the first setting, we tokenized each Chinese sentence into a list of words and feed them to the RNN. While in the second one, we split ea...
Journal: CoRR
Volume: abs/1605.06083
Publication date: 2016